Abstract

Site-specific incorporation of bioorthogonal unnatural amino acids into proteins provides a useful tool for the installation of 
specific functionalities that will allow for the labeling of proteins with virtually any probe. We demonstrate the genetic 
encoding of a set of alkene lysines using the orthogonal PylRS/PylT CUA pair in Escherichia coli. The installed double bond
functionality was then applied in a photoinitiated thiol-ene reaction of the protein with a fluorescent thiol-bearing probe, as 
well as a cysteine residue of a second protein, showing the applicability of this approach in the formation of heterogeneous 
non-linear fused proteins. 
Introduction

Covalent attachment of proteins to ligands, polymers, and 
surfaces creates macromolecules combining specific biological 
function with favorable physical and chemical properties. For 
example, studying biological processes in their native environment 
often requires the addition of reporter tags to proteins [1]. To date, 
the mainstay tagging strategy for imaging of proteins involves 
genetic fusions of fluorescent proteins [2,3]. However, the large 
size of fluorescent proteins can interfere with the folding and 
activity of the targeted protein [4]. Alternatively, tag-mediated 
labeling methods have been exploited, including self-labeling 
proteins, such as HaloTag, SNAP-tag, CLIP-tag, and enzyme- 
mediated labeling [5,6]. Although these methods allow for smaller 
reporter tags, limitations with regard to the position and the 
structure of the label remain and the presence of an enzyme is 
required. 

An alternative strategy to label proteins is via the introduction of 
a single-residue modification, which is nearly non-perturbing. Site- 
specific protein labeling can be achieved by the installation of tags 
through bioconjugation reactions with reactive handles previously 
installed in a protein by using an orthogonal aminoacyl-tRNA 
synthetase/aminoacyl-tRNA pair for unnatural amino acid (UAA)
mutagenesis [7-10]. The bioorthogonal groups can be installed at 
virtually any position at the protein expressed in pro- and 
eukaryotic cells and the choice of probes is nearly limitless. 
Bioorthogonal chemical handles that have been genetically 
encoded for conjugation reactions include ketones [11-14], azides
[15-20], alkenes [21-28], alkynes [24,29-32], tetrazines [33], aryl
halides [34,35], and aryl boronates [36]. The alkene functionality
is currently receiving considerable attention due to its versatility in 
organic transformations and it is rarely found in natural proteins 
[37,38], allowing for selective modification. Carbon-carbon 
double bonds have been exploited for protein modification in 
reactions including olefin-metathesis [26,39], photoaddition of 
tetrazoles [25,27,28], inverse electron demand Diels-Alder cyclo- 
additions [23,24,30], and thiol-ene reactions [22,40-44]. 

Bioorthogonal reactions have been applied in a variety of site- 
specific modifications of proteins such as fluorescent labeling, 
PEGylation, biotinylation, post-translational modification mimics, 
and surface immobilization [7,9,45-47]. Another area of interest 
for which bioconjugation reactions have been explored is in the 
generation of non-linear protein fusions. In biological systems, 
proteins often bind to other proteins to gain stability, affinity and 
higher specificity to perform specific cellular functions such as 
signal transduction, transcriptional regulation, and DNA repair 
[48-50]. Elucidation of many of these processes has led to the
generation of chemical and biosynthetic methods to create non- 
linear protein linkages post-translationally for the control and 
performance of a number of functions, as well as protein 
trafficking and isolation. Methods that have been explored include 
native chemical ligation [51-54], enzyme based strategies [55,56], 
and conjugation employing reactions with UAA residues [57-63].
The introduction of UAAs at a specific position allows for greater 
topological diversity with minimal protein modification [7-9,46]. 
Here, we apply the site-specific genetic incorporation of
alkenes into proteins to the direct, spacer-free generation of
non-linear protein fusions.
The thiol-ene reaction involves a radical-mediated addition of a
thiol to an alkene that occurs upon UV irradiation (365-405 nm) 
[64,65]. The reaction offers the possibility of using light to control 
both in space and time the formation of a stable thioether bond. As 
a result of its specificity for alkenes and compatibility with aqueous 
environments, the thiol-ene reaction is a bioorthogonal reaction 
that has been applied in polymer and material synthesis [66-72], 
carbohydrate modification [73,74], and peptide and protein 
modification [21,22,29,40-44]. Recently, orthogonal thiol-ene 
bioconjugations applying alkenyl UAAs and synthetic organic 
reaction partners have been reported [21,22]. To expand
the chemical diversity of these orthogonal handles, we demonstrate
the synthesis and genetic incorporation of additional alkene lysines, and
protein heterodimer formation under alternative thiol-ene reaction conditions.

Results and Discussion 

Incorporation of alkene lysines into proteins 

The pyrrolysyl-tRNA synthetase (PylRS), found in certain
methanogenic archaea and bacteria, directly charges pyrrolysine
(Pyl) onto its cognate tRNA, which subsequently delivers it in
response to an in-frame amber stop codon, TAG [75-77].
Furthermore, it has been demonstrated that the pyrrolysyl-tRNA
synthetase/pyrrolysyl-tRNA CUA pairs from Methanosarcina barkeri
(MbPylRS/PylT CUA) and M. mazei (MmPylRS/PylT CUA) are
functional in Escherichia coli [78], Saccharomyces cerevisiae [79],
mammalian cells [80], Caenorhabditis elegans [81,82], Drosophila
melanogaster [83], and Xenopus laevis oocytes [84]. In addition,
the wild-type PylRS accommodates a broad range of
unnatural lysine derivatives based on a carbamate linkage at the
ε-amino group, bearing a variety of functional groups including
tert-butyl [85], azido, alkynyl [17], norbornene [23], and diazirine
[86]. We first synthesized a small
collection of aliphatic alkene-lysines to diversify the structure of
bioconjugation handles and to explore the ability of the MbPylRS to
accommodate long-chain alkenes and lysine linkages other than
carbamates (Figure 1A and Schemes S1-S4).

To investigate whether the synthesized alkene-lysines are
substrates for the wild-type MbPylRS, the incorporation efficiencies
of 1-9 into myoglobin were evaluated by protein expression in
E. coli. Cells were grown in the absence of a UAA and in the
presence of 1-9. The amino acids 1, 2 and 3 have been previously
described and incorporated into proteins using wild-type PylRS
and/or PylRS mutants [22,85]. Here we found that additional
analogs can be efficiently incorporated into myoglobin by the
MbPylRS. The obtained incorporation efficiencies and ESI-MS
results are listed in Figure 1B and the corresponding SDS-PAGE
analysis is shown in Figure S1.

Previous crystallographic studies of PylRS have indicated that 
the synthetase holds a large hydrophobic pocket, capable of 
accommodating bulky and hydrophobic moieties [87,88]. In 
addition, it has been found that the carbamate moiety at the lysine 
side-chain is an essential discriminator for substrate recognition. 
For instance, the oxygen atom adjacent to the side-chain carbonyl 
group in 1 interacts via a water-mediated hydrogen bond with the 
side-chain carbonyl group of Asn346, a key residue in establishing 
substrate recognition in PylRS [85,87]. We found that the amino
acid binding pocket of MbPylRS exhibited flexibility to accommodate
substrates 1, 2, 3, 6 and 7, with amino acids 1 and 7
showing the highest incorporation efficiency, which could be
explained by their smaller size. In contrast, amino acids 4 and 5
were not efficiently incorporated into protein, presumably due to their
longer carbon chains.
Figure 1. Genetic incorporation of alkene-lysine analogs into
myoglobin by the wild-type MbPylRS/PylT CUA pair. (A) Structures
of alkenyl lysine derivatives bearing an ε-carbamate linkage (1-6), an
inverted carbamate 7, an amide 8, and a urea 9. (B) Myoglobin
comparative incorporation efficiencies (%) and ESI-MS results.
doi:10.1371/journal.pone.0105467.g001

The successful incorporation of 1 and 7 into protein, together
with the inefficient substrate recognition of 8 and 9 by MbPylRS,
suggests that the presence of an oxygen atom adjacent to the
side-chain carbonyl group allows the hydrogen-bond network to be
established more efficiently. We hypothesize that the recognition
of 7 by MbPylRS may be possible by re-directing the necessary
interactions of the synthetase's binding pocket to the Opposition. 
Moreover, we have previously observed a preference for the
carbamate moiety over an amide group to drive the efficient
genetic encoding of ε-N-propargyloxycarbonyl-lysine by the
wild-type MbPylRS/PylT CUA pair, while its amide analog
ε-N-pentynoyl-lysine was not accepted as a substrate [17]. Although
analogs that bear a side-chain amide moiety have been
incorporated into proteins by wild-type PylRS, so far only
structures with up to four atom bonds in length from the amide
ε-amino group have been tolerated by the enzyme's binding
pocket [89,90]. Since our amino acid 8 is a bond longer, we can
speculate that the carbamate functionality in 1, compared to the
amide in 8, increases the efficiency of substrate recognition by
MbPylRS, as the enzyme also tolerated the longer
amino acids 2 and 3. The amino acid 9, which bears a urea
linkage, seemed to be slightly favored by MbPylRS compared to
the amide 8. However, the amino acid 9 still proves to be a poor
substrate compared to 1. Our findings suggest that the replacement
of the oxygen atom on the carbamate by a carbon or
nitrogen atom may be enough to discriminate between the very
similar substrates 1, 8, and 9, possibly due to weaker interactions
with the amide or urea functionalities, thus not
favoring an efficient binding of 8 or 9 in the MbPylRS amino
acid pocket.
With amino acids 1-3 showing good incorporation efficiency,
we site-specifically incorporated these amino acids into superfolder
Green Fluorescent Protein (sfGFP) as a second model protein in E.
coli. We found that alkene lysines 1-3 were successfully introduced
at position Y151 in sfGFP (Figure 2A) and that sfGFP yields were
obtained at 32-70 mg/L, an approximately 10-fold increase of
incorporation efficiency compared to myoglobin bearing the same
amino acids 1-3. ESI-MS analysis of purified sfGFP shows
molecular weights corresponding to the site-specific incorporation
of 1, 2, and 3 (Figure 2B).
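
As a rough illustration of the amber-codon strategy used for site-specific incorporation, the sketch below replaces the codon of a chosen residue with TAG in a coding sequence. The helper names (`introduce_amber`, `in_frame_stops`) and the five-codon toy sequence are hypothetical and serve only to show the bookkeeping; they are not the sfGFP sequence or the cloning procedure used in this work.

```python
# Minimal sketch, assuming a plain DNA coding sequence given 5'->3' and in frame.
# All names and the toy sequence are illustrative assumptions.
STOP_CODONS = {"TAG", "TAA", "TGA"}


def introduce_amber(cds: str, residue: int) -> str:
    """Return the coding sequence with the codon of 1-based `residue` set to TAG."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    if not 1 <= residue <= len(cds) // 3:
        raise ValueError("residue position outside the coding sequence")
    start = 3 * (residue - 1)
    return cds[:start] + "TAG" + cds[start + 3:]


def in_frame_stops(cds: str) -> list[int]:
    """List the 1-based codon positions of all in-frame stop codons."""
    cds = cds.upper()
    return [i // 3 + 1 for i in range(0, len(cds) - 2, 3) if cds[i:i + 3] in STOP_CODONS]


toy_cds = "ATGAGCTATGGCTAA"          # Met-Ser-Tyr-Gly-stop (toy example)
mutant = introduce_amber(toy_cds, 3)  # replace the Tyr codon with the amber codon
print(mutant)
print(in_frame_stops(mutant))
```

In the mutant, the amber codon is read through only when the orthogonal synthetase/tRNA pair and a suitable UAA are present; otherwise translation terminates at that position, which is consistent with the -AA control described for Figure 2.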

sfGFP labeling via the thiol-ene reaction 

To verify that the thiol-ene reaction is suitable for labeling the 
alkene-bearing sfGFP, dansyl-thiol (10) was used as a fluorescent 
probe (Figure 3A and Scheme S5). Wild-type sfGFP and modified 
sfGFPs carrying 1 or 2, which showed the highest incorporation 
efficiency, were subjected to a thiol-ene reaction with 10 by 
irradiating the reaction mixture with 365 nm UV light in the 
presence of the photoinitiator I2959 for 5 min. The samples were
then analyzed by SDS-PAGE and in-gel fluorescence imaging.
Figure 3B shows that the alkene-containing sfGFPs modified with 
1 and 2 were both selectively labeled with 10 after UV irradiation 
while the wild-type sfGFP was not fluorescently labeled. These 
results demonstrate that a thiol-containing fluorescence probe 
could be site-specifically conjugated to sfGFP bearing an alkene 
functional group. 

Figure 2. Genetic incorporation of alkene-lysine analogs 1, 2
and 3 into sfGFP. (A) SDS-PAGE analysis of purified sfGFP. -AA: no
UAA was supplemented; WT: wild-type sfGFP; 1, 2 and 3: expression in
the presence of the corresponding UAA (1 mM). (B) Protein yields
(*wild-type sfGFP yield is 70 mg/L, 100%) and ESI-MS results.
doi:10.1371/journal.pone.0105467.g002

In order to show the potential of the thiol-ene reaction in
protein chemistry, we hypothesized that cysteine residues in
another protein could also be used as a possible reaction partner,
leading to the formation of a non-linear protein heterodimer
(Figure 3A). Lysozyme is a small protein containing 8 cysteine
residues within 129 amino acids [91]. The cysteines form 4
disulfide bonds and can be reduced to release free thiol groups.
Analysis of bioconjugated proteins by SDS-PAGE revealed bands
of the expected molecular weight: the bands corresponding to
sfGFP shifted from 28 kD to 44 kD upon conjugation to lysozyme
after UV exposure in the presence of the photoinitiator I2959 for
10 min (Figure 3C). This result indicates that the majority of the
observed products are sfGFP-lysozyme heterodimers since lysozyme
was supplied in 4-fold excess compared to alkenyl sfGFP.
Without UV irradiation, no significant mobility shift was observed.
As expected, wild-type sfGFP did not undergo a thiol-ene reaction
with lysozyme. Overall, a successful protein-protein heterodimer
formation via thiol-ene conjugation of an alkene-containing
protein was achieved.
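
The band assignment can be sanity-checked with simple arithmetic. The sketch below adds the apparent sfGFP mass quoted above (28 kD) to an assumed lysozyme mass of about 14.3 kD (the commonly cited value for 129-residue hen egg-white lysozyme, which is an assumption here since the text does not state the mass). Because the thiol-ene reaction is an addition, no mass is lost on conjugation. The function name is hypothetical.

```python
# Minimal sketch: expected mass of sfGFP-lysozyme conjugates.
SFGFP_KD = 28.0      # apparent mass of sfGFP on SDS-PAGE, from the text
LYSOZYME_KD = 14.3   # ASSUMPTION: hen egg-white lysozyme, about 14.3 kD


def conjugate_mass_kd(n_sfgfp: int = 1, n_lysozyme: int = 1) -> float:
    """Sum of component masses; a thiol-ene addition releases no leaving group."""
    return n_sfgfp * SFGFP_KD + n_lysozyme * LYSOZYME_KD


for stoichiometry in [(1, 1), (2, 1)]:
    print(stoichiometry, round(conjugate_mass_kd(*stoichiometry), 1))
```

Under these assumptions a 1:1 conjugate is expected near 42 kD, close to the observed 44 kD band given the limited accuracy of SDS-PAGE mass estimates, whereas a conjugate carrying two sfGFP units would run considerably higher.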

In both bioconjugation strategies we found that the addition of 
sodium dodecyl sulfate (SDS) was necessary for an efficient and 
specific conjugation reaction to alkene-labeled proteins within
5-10 min, in contrast to previously reported 1-2 h reaction times
[22,29,43], thus significantly reducing UV exposure. We found 
that under our experimental conditions, lysozyme is (at least 
partially) denatured [92,93], as confirmed by circular dichroism 
(CD) spectroscopy (Figure S2). This may result in more accessible 
cysteine residues and facilitate the thiol-ene bioconjugation 
reaction. Moreover, as a well-known surfactant, SDS has been 
proposed to form micelles in thiol-ene reactions for water-based 
polymerization reactions [94,95]. It is possible that the association
of the proteins with micelles may increase their local concentra- 
tion, thus further facilitating the reaction. 

Conclusions 

In conclusion, we have synthesized a collection of alkene lysines
of varying length and ε-linkages and demonstrated their
site-specific, genetically encoded incorporation into proteins in E. coli
by the wild-type MbPylRS/PylT CUA pair. The alkene-containing
amino acids 1-3 showed the highest incorporation efficiencies into
myoglobin and protein yields decreased with increasing side-chain
length, hinting at the limitations of the wild-type synthetase's binding
pocket to accommodate sterically demanding amino acids. Among
these amino acids, we also successfully incorporated the amino
acid 7 with an inverted carbamate functionality at the ε-position of
lysine. Replacement of the carbamate motif with an amide or urea
failed to provide efficient incorporation into protein, once again
suggesting that the carbamate moiety at the lysine side-chain can
be an essential discriminator for substrate recognition by wild-type
PylRS.

Next, the alkene amino acids 1, 2, and 3 were successfully 
incorporated into sfGFP, with 1 and 2 exhibiting the highest 
incorporation efficiency. Utilizing the thiol-ene reaction, alkene-
bearing sfGFP was site-specifically bioconjugated to a dansyl-thiol
fluorophore (10) upon irradiation with 365 nm UV light in the
presence of the photoinitiator I2959 within only 5 min. In addition, we
applied the site-specific genetic incorporation of alkene-bearing 
amino acids into proteins in the direct, spacer-free synthesis of a 
non-linear protein fusion of sfGFP and lysozyme. All components 
are recombinantly expressed and no post-translational introduc- 
tion of functional groups was required. The work described herein 
demonstrates for the first time the assembly of a protein 
heterodimer by means of a light-induced thiol-ene ligation using 
genetically encoded alkene-bearing UAAs. This approach may 
become a promising tool to create non-linear proteins directly, 
with minimal synthetic effort, by creating direct protein-to-protein 
conjugations. 

Materials and Methods 

Synthesis of alkene lysines: general considerations 

Unless otherwise stated, all reagents were commercial
reagents used without purification, and reactions were performed
under nitrogen using flame-dried glassware. The 1H NMR and
13C NMR spectra were recorded on a 300 MHz or 400 MHz
Varian NMR spectrometer. The amino acid 1 was purchased
from Chem-Impex International, Inc. For synthesis schemes of
2-10, please refer to Schemes S1-S5.

Figure 3. Alkenyl-sfGFP is fluorescently labeled with dansyl-thiol, and bioconjugated to lysozyme to assemble a non-linear protein
dimer via the thiol-ene reaction. (A) sfGFP bearing an alkene functionality reacts photochemically with dansyl-thiol (10) or lysozyme (LYZ). (B)
SDS-PAGE analysis demonstrates the labeling of alkenyl-sfGFP with 10 after 5 min of UV irradiation via thiol-ene ligation (lanes 5 and 6). Fluorescence
(top) and Coomassie stain (bottom). (C) SDS-PAGE analysis shows mobility band shifts from 28 kD to 44 kD after samples were UV irradiated for
10 min (lanes 8 and 9), corresponding to the molecular weight of the sfGFP-lysozyme conjugate. WT: wild-type sfGFP; 1 and 2: sfGFP carrying the
corresponding UAA; LYZ: lysozyme. -UV: samples were not exposed to UV irradiation. +UV: samples were irradiated at 365 nm for 5 or 10 min.
doi:10.1371/journal.pone.0105467.g003

Procedure for the synthesis of 2a-5a 

But-3-en-1-yl (2,5-dioxopyrrolidin-1-yl) carbonate
(2a). N,N'-Disuccinimidyl carbonate (353 mg, 1.38 mmol) was
added to a solution of 3-buten-1-ol (60 µL, 0.69 mmol) and
triethylamine (288 µL, 2.07 mmol) in dry acetonitrile (5 mL) at
room temperature. The resulting mixture was stirred overnight
and then concentrated under vacuum. The product was purified
by column chromatography on SiO2 gel, eluting with 97:2:1
DCM/acetone/TEA, to deliver 2a (96 mg, 65%) as a colorless oil.
1H-NMR (400 MHz, CDCl3): δ 5.73 (m, 1 H), 5.15-5.08 (m,
2 H), 4.31 (t, J = 6.4 Hz, 2 H), 2.78 (s, 4 H), 2.46 (m, 2 H) ppm;
13C-NMR (100 MHz, CDCl3): δ 168.8, 151.5, 132.4, 118.4, 70.2,
32.7, 25.4 ppm.
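
The reported quantities for 2a can be cross-checked with a short calculation. The sketch below uses standard average atomic masses and a molecular formula for 2a (C9H11NO5) that is derived here from the compound name rather than stated in the text; the function names are hypothetical.

```python
# Minimal sketch: reagent equivalents and percent yield for the synthesis of 2a.
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}  # g/mol

# ASSUMPTION: formula of 2a derived from its name, not given in the source.
FORMULA_2A = {"C": 9, "H": 11, "N": 1, "O": 5}


def molar_mass(formula: dict) -> float:
    """Average molar mass in g/mol (numerically equal to mg/mmol)."""
    return sum(ATOMIC_MASS[element] * count for element, count in formula.items())


def equivalents(reagent_mmol: float, limiting_mmol: float) -> float:
    """Molar equivalents of a reagent relative to the limiting reagent."""
    return reagent_mmol / limiting_mmol


def percent_yield(isolated_mg: float, limiting_mmol: float, formula: dict) -> float:
    """Isolated mass as a percentage of the theoretical mass."""
    theoretical_mg = limiting_mmol * molar_mass(formula)
    return 100.0 * isolated_mg / theoretical_mg


print(round(equivalents(1.38, 0.69), 2))   # disuccinimidyl carbonate vs. alcohol
print(round(equivalents(2.07, 0.69), 2))   # triethylamine vs. alcohol
print(round(percent_yield(96, 0.69, FORMULA_2A), 1))
```

With 3-buten-1-ol (0.69 mmol) as the limiting reagent, the calculated yield agrees with the reported 65%, and the carbonate and base correspond to about 2 and 3 equivalents, respectively.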

2,5-Dioxopyrrolidin-1-yl pent-4-en-1-yl carbonate
(3a). Compound 3a (193 mg, 73%) was obtained as a colorless
oil from 4-penten-1-ol (0.12 mL, 1.16 mmol) by following the
procedure described above. 1H-NMR (400 MHz, CDCl3): δ 5.78
(m, 1 H), 5.09-5.01 (m, 2 H), 4.34 (t, J = 6.4 Hz, 2 H), 2.83 (s,
4 H), 2.17 (m, 2 H), 1.85 (m, 2 H) ppm; 13C-NMR (100 MHz,
CDCl3): δ 168.6, 151.4, 136.5, 115.8, 70.6, 29.2, 27.3, 25.3 ppm.

2,5-Dioxopyrrolidin-1-yl hex-5-en-1-yl carbonate
(4a). Compound 4a (232 mg, 77%) was obtained as a colorless
oil from 5-hexen-1-ol (0.15 mL, 1.25 mmol) by following the
procedure described above. 1H-NMR (400 MHz, CDCl3): δ 5.73
(m, 1 H), 5.05-4.91 (m, 2 H), 4.27 (t, J = 6.4 Hz, 2 H), 2.77 (s,
4 H), 2.04 (m, 2 H), 1.70 (m, 2 H), 1.45 (m, 2 H) ppm; 13C-NMR
(100 MHz, CDCl3): δ 169.2, 151.9, 138.2, 115.5, 71.7, 33.3, 28.0,
25.7, 24.9 ppm.

2,5-Dioxopyrrolidin-1-yl hept-6-en-1-yl carbonate
(5a). Compound 5a (202 mg, 67%) was obtained as a colorless
oil from 6-hepten-1-ol (0.16 mL, 1.17 mmol) by following the
procedure described above. 1H-NMR (400 MHz, CDCl3): δ 5.77
(m, 1 H), 5.00-4.91 (m, 2 H), 4.28 (t, J = 6.8 Hz, 2 H), 2.80 (s,
4 H), 2.05 (m, 2 H), 1.72 (m, 2 H), 1.39 (m, 4 H) ppm; 13C-NMR
(100 MHz, CDCl3): δ 183.6, 168.9, 151.6, 138.5, 114.7, 71.6,
33.5, 28.3, 25.5, 24.9 ppm.

Procedure for the synthesis of 2b-5b 

(S)-6-(((But-3-en-1-yloxy)carbonyl)amino)-2-((tert-butoxycarbonyl)
amino)hexanoic acid (2b). Boc-L-Lys-OH (218 mg, 0.88 mmol) was
added to a stirred solution of 2a (157 mg, 0.74 mmol) in dry DMF (2 mL).
The reaction was allowed to continue overnight at room temperature. The
mixture was diluted in water (10 mL) and extracted with EtOAc (3 x 10 mL).
The combined organic layers were washed with water (3 x 20 mL) and brine
(10 mL). The resulting organic layer was dried over Na2SO4, filtered and
concentrated in vacuo to dryness to furnish 2b (219 mg, 86%) as an off-white
foam. 1H-NMR (400 MHz, CDCl3): δ 11.15 (s, br, 1 H), 6.26 (s, br, 0.5 H),
5.75 (m, 1 H), 5.37 (m, br, 1 H), 5.09-5.01 (m, 2 H), 4.87 (s, br, 0.5 H), 4.26
(s, br, 1 H), 4.08 (m, br, 2 H), 3.13 (m, 2 H), 2.34 (m, 2 H), 1.81-1.40 (m,
15 H) ppm; 13C-NMR (100 MHz, CDCl3): δ 176.5, 157.2, 156.0, 138.5,
114.9, 80.1, 65.0, 53.3, 40.6, 33.5, 32.2, 29.5, 28.5, 22.5 ppm.

(S)-2-((tert-Butoxycarbonyl)amino)-6-(((pent-4-en-1-yloxy)
carbonyl)amino)hexanoic acid (3b). Compound 3b (517 mg,
92%) was obtained as an off-white foam from 3a (335 mg,
1.47 mmol) by following the procedure described above. 1H-NMR
(300 MHz, CDCl3): δ 6.30 (m, br, 0.5 H), 5.80 (m, 1 H), 5.27 (m,
br, 1 H), 5.06-4.96 (m, 2 H), 4.80 (s, br, 0.5 H), 4.30 (s, br, 1 H),
4.06 (m, br, 2 H), 3.17 (m, 2 H), 2.10 (m, 2 H), 1.84-1.65 (m,
4 H), 1.58-1.35 (m, 13 H) ppm; 13C-NMR (100 MHz, CDCl3): δ
176.2, 157.1, 155.9, 137.7, 115.2, 80.3, 64.7, 53.5, 40.9, 32.5,
30.3, 29.7, 28.7, 22.8 ppm; HRMS-ESI (m/z): [M+K]+ calcd for
C17H30N2O6 397.1741, found 397.1726.

(S)-2-((tert-Butoxycarbonyl)amino)-6-(((hex-5-en-1-yloxy)
carbonyl)amino)hexanoic acid (4b). Compound 4b
(177 mg, 93%) was obtained as an off-white foam from 4a
(123 mg, 0.51 mmol) by following the procedure described
above. 1H-NMR (400 MHz, CDCl3): δ 8.40 (s, br, 1 H), 6.29
(s, br, 0.5 H), 5.78 (m, 1 H), 5.31 (m, br, 1 H), 5.02-4.93 (m,
2 H), 4.90 (s, br, 0.5 H), 4.35 (s, br, 1 H), 4.05 (m, br, 2 H), 3.16
(m, 2 H), 2.06 (m, 2 H), 1.82-1.43 (m, 19 H) ppm; 13C-NMR
(100 MHz, CDCl3): δ 176.3, 157.9, 155.7, 138.5, 114.8, 80.1,
64.9, 53.2, 40.6, 33.4, 32.1, 29.4, 28.5, 28.4, 25.2, 22.5 ppm;
HRMS-ESI (m/z): [M+Na]+ calcd for C18H32N2O6 395.2158,
found 395.2145.

(S)-2-((tert-Butoxycarbonyl)amino)-6-(((hept-6-en-1-yloxy)
carbonyl)amino)hexanoic acid (5b). Compound 5b (148 mg,
93%) was obtained as an off-white foam from 5a (105 mg,
0.41 mmol) by following the procedure described above. 1H-NMR
(300 MHz, CDCl3): δ 8.48 (s, br, 1 H), 6.33 (s, br, 0.5 H), 5.78 (m,
1 H), 5.30 (m, br, 1 H), 5.01-4.88 (m, 2.5 H), 4.29 (s, br, 1 H),
4.05 (m, br, 2 H), 3.16 (m, 2 H), 2.03 (m, 2 H), 1.78-1.18 (m,
21 H) ppm; 13C-NMR (75 MHz, CDCl3): δ 176.5, 157.3, 155.8,
138.9, 114.7, 80.3, 65.2, 53.3, 40.6, 33.8, 32.1, 29.6, 29.0, 28.7,
28.5, 25.5, 22.5 ppm; HRMS-ESI (m/z): [M+H]+ calcd for
C13H24N2O4 273.1809, found 273.1876.

Procedure for the Boc-deprotection, 2-5 

(S)-2-Amino-6-(((but-3-en-1-yloxy)carbonyl)amino)hexanoic
acid HCl salt (2). To a solution of 2b (110 mg, 0.32 mmol) and
Et3SiH (0.1 mL, 0.64 mmol) in dry DCM (4.5 mL), trifluoroacetic
acid (0.24 mL, 3.2 mmol) was added dropwise, and the reaction
mixture was allowed to stir at room temperature overnight. The
volatiles were removed under reduced pressure and the residue was
dissolved in a solution of 4 N HCl in 1,4-dioxane (0.25 mL) and
DCM (0.75 mL), allowed to stir for 10 min at room temperature and
then concentrated. The latter process was repeated two more times
to ensure complete TFA to HCl salt exchange. The concentrated
residue was dissolved in a minimal amount of MeOH and was
precipitated into ice-cold Et2O. The precipitate was pelleted by
centrifugation, the supernatant decanted, and the solid was washed
with Et2O before drying under vacuum, affording the amino acid 2
(82.2 mg, 92%) as a white solid. 1H-NMR (400 MHz, DMSO-d6): δ
8.45 (s, br, 3 H), 7.09 (s, br, 1 H), 5.75 (m, 1 H), 5.10-5.01 (m, 2 H),
3.94 (t, J = 6.8 Hz, 2 H), 3.77 (t, J = 6.4 Hz, 1 H), 2.92 (m, 2 H),
2.23 (m, 2 H), 1.75 (m, 2 H), 1.36-1.26 (m, br, 4 H) ppm; 13C-NMR
(100 MHz, DMSO-d6): δ 170.9, 156.2, 134.8, 117.0, 62.7, 51.8,
33.2, 29.6, 28.9, 21.6 ppm; HRMS-ESI (m/z): [M+H]+ calcd for
C11H20N2O4 245.1496, found 245.1490.
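
The calculated HRMS value can be reproduced from monoisotopic masses. The sketch below is a minimal illustration, assuming standard monoisotopic masses for the most abundant isotopes and a singly protonated ion; the function names are hypothetical, and the ppm error is an added convenience that is not reported in the text.

```python
# Minimal sketch: monoisotopic [M+H]+ m/z and ppm error for amino acid 2 (C11H20N2O4).
MONOISOTOPIC = {
    "C": 12.0,          # 12C, exact by definition
    "H": 1.00782503,    # 1H
    "N": 14.00307401,   # 14N
    "O": 15.99491462,   # 16O
}
PROTON = 1.00727647     # mass of H+ (hydrogen atom minus one electron)


def mz_protonated(formula: dict) -> float:
    """m/z of the singly charged [M+H]+ ion for a neutral formula."""
    neutral = sum(MONOISOTOPIC[element] * count for element, count in formula.items())
    return neutral + PROTON


def ppm_error(found: float, calculated: float) -> float:
    """Signed mass error in parts per million."""
    return (found - calculated) / calculated * 1e6


formula_2 = {"C": 11, "H": 20, "N": 2, "O": 4}
calculated = mz_protonated(formula_2)
print(round(calculated, 4))
print(round(ppm_error(245.1490, calculated), 1))
```

Under these assumptions the calculated value rounds to the reported 245.1496, and the found value of 245.1490 lies within a few ppm of it.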

(S)-2-Amino-6-(((pent-4-en-1-yloxy)carbonyl)amino)hexanoic
acid HCl salt (3). Deprotection of 3b (0.5 g, 1.39 mmol) was
performed as described above to obtain 3 (0.40 g, 97%) as a white
solid. 1H-NMR (400 MHz, D2O): δ 5.86 (m, 1 H), 5.07-4.98 (m,
2 H), 4.05-4.00 (m, 3 H), 3.11 (t, J = 5.2 Hz, 2 H), 2.10 (m, 2 H),
1.97-1.87 (m, 2 H), 1.69 (t, J = 6.4 Hz, 2 H), 1.56-1.39 (m, 4 H)
ppm; 13C-NMR (75 MHz, D2O): δ 172.0, 158.9, 138.6, 114.9, 65.0,
52.7, 39.9, 29.6, 29.4, 28.5, 27.5, 21.6 ppm; HRMS-ESI (m/z):
[M+H]+ calcd for C12H22N2O4 259.1652, found 259.1653.



(S)-2-Amino-6-(((hex-5-en-1-yloxy)carbonyl)amino)hexanoic
acid HCl salt (4). Deprotection of 4b (145 mg, 0.39 mmol) was
performed as described above to obtain 4 (108.4 mg, 90%) as a white
solid. 1H-NMR (400 MHz, DMSO-d6): δ 8.28 (s, br, 3 H), 7.08 (s,
br, 1 H), 5.79 (m, 1 H), 5.02-4.93 (m, 2 H), 3.91 (t, J = 6.8 Hz, 2 H),
3.67 (s, br, 2 H), 2.94 (m, 2 H), 2.03 (m, 2 H), 1.76 (m, br, 2 H), 1.52
(t, J = 6.8 Hz, 2 H), 1.38 (m, br, 6 H) ppm; 13C-NMR (100 MHz,
DMSO-d6): δ 171.1, 156.3, 138.5, 115.0, 63.4, 52.3, 32.8, 29.8, 29.0,
28.2, 24.6, 21.7 ppm; HRMS-ESI (m/z): [M+H]+ calcd for
C13H24N2O4 273.1809, found 273.1803.

(S)-2-Amino-6-(((hept-6-en-1-yloxy)carbonyl)amino)hexanoic
acid HCl salt (5). Deprotection of 5b (130 mg, 0.336 mmol) was
performed as described above to obtain 5 (104.4 mg, 96%) as a white
solid. 1H-NMR (400 MHz, DMSO-d6): δ 8.45 (s, br, 3 H), 7.09 (m,
br, 1 H), 5.78 (m, 1 H), 5.01-4.92 (m, 2 H), 3.90 (t, J = 6.4 Hz, 2 H),
3.82 (s, br, 1 H), 2.93 (m, 2 H), 2.01 (m, 2 H), 1.77 (m, br, 2 H), 1.52
(m, 2 H), 1.45-1.28 (m, 10 H) ppm; 13C-NMR (100 MHz,
DMSO-d6): δ 171.0, 156.3, 138.7, 114.8, 63.5, 51.8, 33.1, 29.6, 28.9, 28.6,
28.0, 24.9, 21.6 ppm; HRMS-ESI (m/z): [M+H]+ calcd for
C14H26N2O4 287.1965, found 287.1957.

Procedure for the synthesis of 6 

(S)-6-(((But-2-en-1-yloxy)carbonyl)amino)-2-((tert-butoxycarbonyl)
amino)hexanoic acid (6b). Diphosgene (0.26 mL, 2.16 mmol) was
added dropwise to an ice-cold mixture of 2-buten-1-ol (cis/trans isomers,
~1:19) (0.12 mL, 1.66 mmol) and potassium carbonate (0.69 g, 4.98 mmol)
in dry Et2O (5 mL). The resulting mixture was allowed to stir overnight at
room temperature, filtered and carefully concentrated under reduced
pressure to avoid loss of the volatile product. The chloroformate 6a was
obtained as a clear liquid and, without further purification, was dissolved in
THF (1 mL). It was then added dropwise to an ice-cold solution of
Boc-L-Lys-OH (495 mg, 2.0 mmol) in 1 M aqueous NaOH (1 mL) and THF
(4 mL). The reaction was allowed to run overnight at room temperature.
The volatiles were removed under reduced pressure and the residue was
diluted in water and then washed with EtOAc (10 mL). The water layer was
acidified with 5% citric acid to pH 3-4 and extracted with EtOAc
(3 x 10 mL). The combined organic layers were washed with water (20 mL)
and brine (10 mL). The resulting organic layer was dried over Na2SO4,
filtered and concentrated in vacuo to dryness to furnish 6b (343 mg, 60%) as
an off-white foam. 1H-NMR (400 MHz, CDCl3): δ 8.40 (s, br, 1 H), 6.29 (s,
br, 0.5 H), 5.78 (m, 1 H), 5.31 (m, br, 1 H), 5.02-4.90 (m, 2.5 H), 4.29 (s, br,
1 H), 4.05 (m, br, 2 H), 3.15 (m, 2 H), 2.07 (m, 2 H), 1.81-1.40 (m, 15 H)
ppm; 13C-NMR (75 MHz, CDCl3): δ 176.4, 156.9, 156.0, 131.0, 125.9,
80.1, 65.7, 53.3, 40.6, 32.2, 29.5, 28.5, 22.5, 17.9 ppm; HRMS-ESI (m/z):
[M-H]- calcd for C16H28N2O6 343.1864, found 343.1869.

(S)-2-Amino-6-(((but-2-en-1-yloxy)carbonyl)amino)hexanoic
acid TFA salt (6). To a solution of 6b (317 mg, 0.92 mmol) and
Et3SiH (0.29 mL, 1.84 mmol) in dry DCM (13 mL), trifluoroacetic
acid (0.68 mL, 9.20 mmol) was added dropwise, and the reaction
mixture was allowed to stir at room temperature overnight. The
volatiles were removed under reduced pressure and the residue was
dissolved in a minimal amount of MeOH and precipitated into
ice-cold Et2O. The precipitate was pelleted, the supernatant decanted,
and the solid was washed with Et2O before drying under vacuum,
affording the amino acid 6 (288 mg, 87%) as a white solid. 1H-NMR
(400 MHz, D2O): δ 5.80 (m, 1 H), 5.58 (m, 1 H), 4.43 (d, J = 5.6 Hz,
2 H), 3.85 (m, 1 H), 3.08 (t, J = 6.0 Hz, 2 H), 1.88 (t, J = 6.0 Hz,
2 H), 1.66 (d, J = 6.4 Hz, 3 H), 1.53-1.33 (m, 4 H) ppm; 13C-NMR
(100 MHz, DMSO-d6): δ 171.3, 156.1, 129.6, 126.6, 64.1, 52.7,
38.4, 30.1, 29.1, 26.5, 21.9, 21.6, 17.5 ppm; HRMS-ESI (m/z):
[M+H]+ calcd for C11H20N2O4 245.14958, found 245.14970.
Procedure for the synthesis of 7

(S)-6-((Allylcarbamoyl)oxy)-2-((tert-butoxycarbonyl)amino)
hexanoic acid (7a). 6-Hydroxy-Boc-L-norleucine-OH (25 mg,
0.10 mmol) was dissolved in a solution of dry DCM (1 mL) and
DIPEA (53 µL, 0.30 mmol). The solution was chilled to 0 °C before
the addition of allyl isocyanate (18 µL, 0.20 mmol) and the reaction
was allowed to proceed at 40 °C overnight. After cooling to room
temperature, the mixture was diluted with DCM (3 mL) and 5%
citric acid (4 mL) was added. The aqueous layer was extracted with
DCM (3 x 4 mL) and the combined organic layers were washed with
water (10 mL) and brine (5 mL). The resulting organic layer was
dried over Na2SO4, filtered and concentrated in vacuo to dryness to
furnish 7a (29 mg, 89% yield) as an off-white foam. 1H-NMR
(400 MHz, CDCl3): δ 5.85 (m, 1 H), 5.24-5.07 (m, 2 H), 4.74 (m,
br, 1 H), 4.29 (s, br, 1 H), 4.06 (t, J = 5.6 Hz, 2 H), 3.78 (m, 2 H),
1.93-1.25 (m, 15 H) ppm; 13C-NMR (100 MHz, CDCl3): δ 176.0,
155.8, 135.0, 116.4, 80.3, 64.8, 53.3, 43.3, 32.2, 28.7, 28.5,
22.0 ppm; HRMS-ESI (m/z): [M+Na]+ calcd for C15H26N2O6
353.1689, found 353.1654.

(S)-6-((Allylcarbamoyl)oxy)-2-aminohexanoic acid TFA
salt (7). Deprotection of 7a (28 mg, 0.085 mmol) was performed
by following the procedure described for compound 6 to
afford compound 7 (28.8 mg, 96%) as a white solid. 1H-NMR
(400 MHz, D2O): δ 5.86 (m, 1 H), 5.19-4.80 (m, 2 H), 4.08 (t,
J = 6.0 Hz, 1 H), 3.94 (t, J = 5.6 Hz, 2 H), 3.72 (m, 2 H), 1.95 (m,
2 H), 1.69 (m, 2 H), 1.50 (m, 2 H) ppm; 13C-NMR (100 MHz,
D2O): δ 173.2, 159.0, 135.4, 115.1, 63.1, 53.7, 42.1, 29.7, 28.0,
21.0 ppm; HRMS-ESI (m/z): [M+Na]+ calcd for C15H26N2O6
353.1689, found 353.1654.

Procedure for the synthesis of 8 

(S)-2-((tert-Butoxycarbonyl)amino)-6-(pent-4-enamido)hexanoic
acid (8a). Compound 8a (212 mg, 97%) was obtained as an off-white
foam from 2,5-dioxopyrrolidin-1-yl pent-4-enoate [96] (132 mg,
0.67 mmol) by following the procedure described for compound 2b.
1H-NMR (400 MHz, CDCl3): δ 9.56 (s, br, 1 H), 6.20 (s, br, 1 H), 5.78
(m, 1 H), 5.34 (m, 1 H), 5.06-4.97 (m, 2 H), 4.25 (m, br, 1 H), 4.46 (s, br,
1 H), 3.23 (m, 2 H), 2.34 (m, 2 H), 2.27 (m, 2 H), 1.82-1.66 (m, br, 2 H),
1.59-1.35 (m, 13 H) ppm; 13C-NMR (100 MHz, CDCl3): δ 175.4, 173.5,
156.0, 137.0, 115.8, 80.1, 53.2, 39.3, 35.8, 32.3, 29.8, 28.9, 28.4,
22.5 ppm; HRMS-ESI (m/z): [M+K]+ calcd for C16H28N2O5 367.1635,
found 367.1625.

(S)-2-Amino-6-(pent-4-enamido)hexanoic acid HCl salt
(8). Deprotection of 8a (175 mg, 0.54 mmol) was performed
by following the procedure described for compound 2 to afford
compound 8 (142 mg, 99%) as a white solid. 1H-NMR (400 MHz,
DMSO-d6): δ 8.45 (s, br, 3 H), 7.93 (m, br, 1 H), 5.77 (m, 1 H),
5.02-4.92 (m, 2 H), 3.95 (m, br, 1 H), 3.81 (s, br, 1 H), 3.00 (m,
2 H), 2.25-2.14 (m, 4 H), 1.79 (m, br, 2 H), 1.40-1.29 (m, br,
4 H) ppm; 13C-NMR (100 MHz, DMSO-d6): δ 171.3, 171.0,
137.8, 115.0, 51.9, 38.0, 34.5, 29.6, 29.3, 28.6, 21.7 ppm;
HRMS-ESI (m/z): [M+H]+ calcd for C11H20N2O3 229.1547, found
229.1545.
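The "calcd" HRMS values quoted for these compounds can be reproduced from the molecular formula and the adduct ion. The following is a minimal sketch, not the authors' procedure; the function names are hypothetical, the monoisotopic masses are standard rounded values, and the electron mass is subtracted for the singly charged cation. Reported values may differ in the fourth decimal place depending on whether that electron correction is applied.

```python
import re

# Monoisotopic masses (u) of the most abundant isotopes (rounded standard values).
MONO = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146,
        "Na": 22.9897693, "K": 38.9637065}
ELECTRON = 0.00054858  # mass of one electron (u)


def neutral_mass(formula):
    """Monoisotopic mass of a neutral molecule given as, e.g., 'C11H20N2O3'."""
    total = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += MONO[element] * (int(count) if count else 1)
    return total


def adduct_mz(formula, adduct):
    """m/z of the singly charged [M+adduct]+ ion (adduct = 'H', 'Na' or 'K')."""
    return neutral_mass(formula) + MONO[adduct] - ELECTRON


# Compound 8, [M+H]+ and compound 8a, [M+K]+
mz_8 = adduct_mz("C11H20N2O3", "H")
mz_8a = adduct_mz("C16H28N2O5", "K")
print(f"8:  {mz_8:.4f}")
print(f"8a: {mz_8a:.4f}")
```

For compound 8 this gives a value that rounds to the reported 229.1547; for 8a the result sits about 0.0005 below the reported 367.1635, which is the size of the electron-mass correction.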

Procedure for the synthesis of 9 

(S)-6-(3-Allylureido)-2-((tert-butoxycarbonyl)amino)hexanoic
acid (9a). Allyl isocyanate (100 µL, 1.13 mmol) was dissolved in dry
DMF (2 mL) and the solution was chilled to 0°C before adding
Boc-Lys-OH (334 mg, 1.36 mmol) and DMAP (166 mg, 1.36 mmol). The
reaction was heated at 70°C overnight. After cooling to room
temperature, the mixture was diluted with water (6 mL) and extracted
with EtOAc (3 × 6 mL). The combined organic layers were washed
with water (3 × 15 mL) and brine (8 mL). The resulting organic layer
was dried over Na2SO4, filtered and concentrated in vacuo to dryness
to furnish 9a (213 mg, 57% yield) as an off-white foam. 1H-NMR
(400 MHz, CDCl3): δ 8.78 (s, br, 1 H), 6.19 (s, br, 0.5 H), 5.87-5.77
(m, 1 H), 5.46 (d, J = 7.2 Hz, 1 H), 5.20-5.09 (m, 2 H), 4.25 (m, br,
1 H), 4.11 (m, br, 0.5 H), 3.77 (m, br, 2 H), 3.12 (m, br, 2 H),
1.80-1.68 (m, 2 H), 1.58-1.43 (m, 13 H) ppm; 13C-NMR (100 MHz,
CDCl3): δ 176.0, 159.7, 156.0, 135.2, 116.1, 80.2, 53.5, 43.2, 40.3,
32.3, 29.4, 28.6, 22.5 ppm; HRMS-ESI (m/z): [M+Na]+ calcd for
C15H27N3O5 352.1848, found 352.1845.
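Agreement between a calculated and a measured HRMS value is commonly expressed as a relative error in parts per million. The sketch below is an illustrative calculation using the calcd/found pairs quoted in this section; the function name is hypothetical and no acceptance threshold is implied by the text.

```python
def ppm_error(calcd, found):
    """Relative mass error of a measured m/z against the calculated m/z, in ppm."""
    return (found - calcd) / calcd * 1e6


# (calcd, found) pairs as quoted in the text
pairs = {
    "7a [M+Na]+": (353.1689, 353.1654),
    "8a [M+K]+": (367.1635, 367.1625),
    "8 [M+H]+": (229.1547, 229.1545),
    "9a [M+Na]+": (352.1848, 352.1845),
}

for label, (calcd, found) in pairs.items():
    print(f"{label}: {ppm_error(calcd, found):+.1f} ppm")
```

A negative value means the measured m/z lies below the calculated one.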

(S)-6-(3-Allylureido)-2-aminohexanoic acid TFA salt
(9). Deprotection of 9a (53.7 mg, 0.16 mmol) was performed
by following the procedure described for compound 6 to afford
compound 9 (50 mg, 94%) as a white solid. 1H-NMR (400 MHz,
D2O): δ 5.62-5.55 (m, 1 H), 4.93-4.84 (m, 2 H), 3.81 (t,
J = 6.0 Hz, 1 H), 3.45 (m, 2 H), 2.85 (m, 2 H), 1.72-1.58 (m,
2 H), 1.30-1.09 (m, 4 H) ppm; 13C-NMR (100 MHz, D2O): δ
172.0, 160.5, 135.1, 114.6, 52.7, 42.0, 39.4, 29.4, 28.7, 21.5 ppm;
HRMS-ESI (m/z): [M+H]+ calcd for C10H19N3O3 230.15, found
230.14.

Procedure for the synthesis of 10 

5-(Dimethylamino)-N-(2-sulfanylethyl)naphthalene-1-sulfonamide
(dansyl-thiol, 10). A solution of dansyl chloride (150 mg, 0.56 mmol) and
TEA (194 µL, 1.39 mmol) in dry DCM (0.8 mL) was cooled to 0°C and
added dropwise into an ice-cold solution of cysteamine (86 mg, 1.11 mmol) in
dry DCM (1 mL). The reaction was allowed to stir at room temperature for
3 h, was concentrated, and the product was purified on silica gel, eluting with
97:2:1 DCM/hexanes/TEA to furnish 10 (51.4 mg, 30%) as a yellow film.
Characterization data matched the literature [97].

Myoglobin Expression in E. coli 

The plasmids pMyo4TAGpylT and pBKpylS were co-transformed
into E. coli Top10 cells as previously described [98] and selected
with 25 µg/mL tetracycline and 50 µg/mL kanamycin. A single
colony was used to inoculate 2 mL LB medium containing the
same antibiotics and grown overnight. Next, 500 µL of culture was
used to seed 50 mL of LB culture containing 1 mM of the
corresponding UAAs and antibiotics. The pH was adjusted to 7
with 10 M NaOH immediately before inoculation. Cells were then
cultivated to OD600 = 0.6, and 100 µL of 20% arabinose solution
was added to induce arabinose promoter-driven expression.
The cells were cultivated in a 37°C shaker overnight and
harvested by centrifugation at 3000 g in standard 50 mL conical
tubes. Cells were lysed by resuspending the cell
pellets in standard Ni-NTA phosphate lysis buffer with lysozyme
and 0.1% Triton X-100. After 1 hour of incubation at 4°C, cells
were sonicated on ice to release the soluble fraction, and debris was
removed by centrifugation. The cleared lysates were incubated
with 100 µL of Qiagen Ni-NTA agarose slurry at 4°C to bind
His-tagged myoglobin. The mixture was then centrifuged at 1000 g for
5 min, and the agarose beads were collected and transferred to
microcentrifuge filter columns. Beads were washed three times
with 400 µL Ni-NTA lysis buffer and once with 400 µL
Ni-NTA wash buffer. The protein was eluted with 400 µL of elution
buffer. The eluted sample was mixed with SDS loading buffer, heated
at 95°C for 5 min, loaded onto a 10% SDS-PAGE gel of
1.5 mm thickness, and run at 150 V for 50 min. The gel was
stained overnight with Coomassie blue solution (0.1% Coomassie
blue, 10% acetic acid, 40% ethanol), then de-stained (10% acetic
acid, 40% ethanol) and analyzed (Figure S1). The protein was
dialyzed in 1 L of 20 mM ammonium acetate buffer for mass
spectrometry analysis.

sfGFP Expression in E. coli

The plasmid pMyo4TAGpylT [98] was modified by replacing
the myoglobin coding sequence with the sfGFP gene carrying an
amber stop codon mutation at position Y151, located on
the outer beta sheet domain. The co-transformation was the same
as described above, but a condensed culture protocol was used to
maximize UAA yields. The 2 mL overnight culture was scaled
up to 400 mL of culture in 2 × 1 L Erlenmeyer flasks and grown to
OD600 = 0.6. Cells were harvested in 4 × 50 mL conical tubes and
re-suspended in 50 mL of LB medium containing 1 mM of the
corresponding UAA, antibiotics, and 0.1% arabinose. Cells were
re-suspended by incubating in a rotary shaker at 37°C for 10 min
and collected in a 250 mL Erlenmeyer flask. The cells were
induced for 4 h and harvested by centrifugation. Cell pellets were
first suspended in 3.6 mL of 50 mM Tris-HCl pH 8.0, supplemented
with 2.4 mL of 4 M ammonium sulfate, and extracted by the
three-phase partitioning method [99] with 6 mL of t-butanol and
vigorous shaking. The aqueous bottom layer containing sfGFP was
removed and dialyzed against 1 L Ni-NTA lysis buffer for 1-2 h
to remove most of the ammonium sulfate. The dialyzed samples
were filtered through a 0.45 µm disc filter before loading onto a
Ni-NTA gravity column containing 0.5 mL bed volume. The proteins
were bound and washed with 12 mL bed volume of lysis buffer and
6 mL bed volume of wash buffer containing 50 mM imidazole,
and eluted with Ni-NTA elution buffer. Samples were analyzed by
SDS-PAGE, dialyzed against PBS pH 7.4 for the subsequent labeling
reaction, and then dialyzed against 20 mM ammonium acetate for
mass spectrometry analysis.

Protein MS Analysis 

Protein MS was measured at the Genomics and Proteomics 
Core Laboratories, University of Pittsburgh. The protein solution 
was adjusted to 5 pmol/µL in 80% acetonitrile and 0.1% aqueous
formic acid. The sample was injected into a Bruker micrOTOF 
with an Ultimate 3000 HPLC. The results were deconvoluted to 
calculate the molecular weight using HyStar. 

Thiol-ene Reactions with sfGFP 

A reaction buffer containing 30 µL of 1 M Tris-HCl pH 6.8
(120 mM), 50 µL of 10% SDS (2%), 50 µL of 10 mM TCEP
(2 mM), and 120 µL of water was made. In another Eppendorf
tube, a solution of 10 mM photoinitiator I2959 containing 50%
DMSO in water was prepared. Then 62.5 µL of I2959 were
added to 250 µL of reaction buffer just before labeling. Next,
2.5 µL of reaction buffer/photoinitiator mix were added to 20 µL
of sfGFP (2400 ng/µL) and incubated at room temperature for
10 min. Dansyl-thiol (10) 50X substrate solution was prepared
with 10 µL of 100 mM TCEP (10 mM), 20 µL of 25 mM
dansyl-thiol in DMSO and 70 µL of deionized water. Next, 16.8 µL of
this solution was added to the reaction mixture and incubated at
room temperature for another 10 min. Subsequently, 250 µL of
1X SDS loading buffer containing 2-mercaptoethanol and an
additional 50 µL of 100 mM DTT were prepared for stopping
the reaction. Then 12 µL of the reaction samples were aliquoted
into 200 µL PCR tubes. Samples were placed on a standard UV
transilluminator at 365 nm for 5 min, and the reaction was
stopped by adding 12 µL of 1X SDS loading buffer to the mixture
and heating at 95°C for 5 min. Next, 5 µL of the samples were
loaded onto a 10% SDS-PAGE gel of 1.5 mm thickness and run
at 150 V for 50 min. After electrophoresis, gels were rinsed briefly
with deionized water and imaged. Gels were stained with
Coomassie blue and scanned to visualize the protein bands.
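The final concentrations given in parentheses for the reaction buffer follow from a simple dilution calculation (stock concentration × stock volume / total volume). The sketch below checks them under the assumption that the four listed components make up the whole 250 µL; the function and variable names are illustrative only.

```python
def final_conc(stock_conc, stock_vol_uL, total_vol_uL):
    """Concentration after dilution, in the same unit as stock_conc."""
    return stock_conc * stock_vol_uL / total_vol_uL


# component: (stock concentration, volume added in µL)
components = {
    "Tris-HCl (mM)": (1000.0, 30.0),  # 1 M stock
    "SDS (%)": (10.0, 50.0),          # 10% stock
    "TCEP (mM)": (10.0, 50.0),        # 10 mM stock
}
water_uL = 120.0

total_uL = water_uL + sum(vol for _, vol in components.values())
finals = {name: final_conc(conc, vol, total_uL)
          for name, (conc, vol) in components.items()}

print(f"total volume: {total_uL} µL")
for name, value in finals.items():
    print(f"{name}: {value}")
```

The same function applies to the 50X dansyl-thiol substrate solution, where 10 µL of 100 mM TCEP in a 100 µL total gives 10 mM.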



For protein heterodimer formation, the thiol-ene conjugation
was carried out with a denatured and reduced lysozyme solution.
A reaction buffer containing 120 µL of 1 M Tris-HCl pH 6.8,
200 µL of 10% SDS, and 1 mL of 10 mM TCEP was prepared.
In 1.32 mL of the reaction buffer, 19 mg of lysozyme were
dissolved (1 mM). The solution was sealed in a 2 mL
microcentrifuge tube with a rubber septum and purged with
nitrogen for 30 min. The tube containing the protein solution
was then heated at 75°C for 30 min. Photoinitiator I2959 was
diluted to 10 mM in a solution of 50% DMSO in water. sfGFP
solutions were adjusted to a concentration of 23 µM (650 ng/µL),
and 20 µL of this solution was mixed with 2 µL of reduced
lysozyme and 0.5 µL of I2959 in the dark. The PCR tube
containing this mixture was then placed on a standard UV
transilluminator at 365 nm for 10 min. A solution containing
120 µL of 1 M Tris-HCl pH 6.8, 20 µL of 10% SDS, 10 µL of
1 M TCEP and 200 µL of glycerol was prepared, and 20 µL of it
were added to the reaction solution immediately after irradiation.
Then 15 µL of the resulting sample was loaded onto a native or 10%
SDS-PAGE gel following standard procedures.
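The molar concentrations quoted above can be cross-checked from the mass concentrations. The sketch below is illustrative only: the lysozyme molecular weight of about 14.3 kDa is an assumption (a typical value for hen egg-white lysozyme, not stated in the text), and the sfGFP molecular weight is not assumed but back-calculated from the paired values 650 ng/µL and 23 µM.

```python
LYSOZYME_MW = 14300.0  # g/mol; assumed value, not given in the text


def molar_conc_mM(mass_mg, volume_mL, mw_g_per_mol):
    """Millimolar concentration from mass (mg), volume (mL) and MW (g/mol)."""
    # mg/mL equals g/L; dividing by g/mol gives mol/L; x1000 converts to mM.
    return mass_mg / volume_mL / mw_g_per_mol * 1000.0


def implied_mw(mass_conc_ng_per_uL, molar_conc_uM):
    """MW (g/mol) implied by a paired mass and molar concentration."""
    # ng/µL equals 1e-3 g/L and µM equals 1e-6 mol/L, hence the factor 1000.
    return mass_conc_ng_per_uL / molar_conc_uM * 1000.0


lysozyme_mM = molar_conc_mM(19.0, 1.32, LYSOZYME_MW)
sfgfp_mw = implied_mw(650.0, 23.0)

print(f"lysozyme: {lysozyme_mM:.2f} mM")
print(f"implied sfGFP MW: {sfgfp_mw:.0f} g/mol")
```

Under these assumptions the lysozyme stock comes out close to the stated 1 mM, and the sfGFP figures imply a molecular weight of roughly 28 kDa.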

Circular dichroism analysis of lysozyme 

CD experiments were performed on an Olis Circular Dichroism 
Spectrophotometer using 0.1 cm quartz cuvettes. A solution 
containing 19 mg of lysozyme in 1.32 mL 10 mM phosphate 
buffer pH 7.4 was prepared. SDS was added to the final 
concentration of 2%. The lysozyme concentration was diluted to 
20 µM for the CD experiment, and CD spectra were collected
from 195 to 260 nm in 1 nm increments with an integration time 
of 5 s and a bandwidth of 2 nm. Increased intensity in the far-UV 
spectrum with the addition of SDS (Figure S2) is in agreement 
with previous observations [100]. 

Supporting Information 

Figure S1 SDS-PAGE analysis for the incorporation of
alkene-bearing lysines 1-9 into myoglobin. -AA: no UAA
was supplemented; +AA: positive control UAA (1 mM); 1-9:
myoglobin expression in the presence of the corresponding UAA
(1 mM).
(TIF) 

Figure S2 Circular dichroism (CD) spectrum of lyso- 
zyme with and without SDS treatment. Blue: lysozyme with 
no SDS; Red: lysozyme with 2% SDS. 
(TIF) 

Scheme S1 Synthesis of alkene-bearing lysines 2-6.
(TIF)

Scheme S2 Synthesis of alkene-bearing lysine 7.
(TIF)

Scheme S3 Synthesis of alkene-bearing lysine 8.
(TIF)

Scheme S4 Synthesis of alkene-bearing lysine 9.
(TIF)

Scheme S5 Synthesis of dansyl-thiol, 10.
(TIF)